Recommended from our members

QSAR-derived affinity fingerprints (part 1): fingerprint construction and modeling performance for similarity searching, bioactivity classification and scaffold hopping.

Author: Bender A
Cortés-Ciriano I
Dehaen W
Kříž P
Svozil D
Tetko IV
van Westen GJP
Škuta C
Publication venue: J Cheminform
Publication date: 29/05/2021

Funder: FP7 People: Marie-Curie Actions; doi: http://dx.doi.org/10.13039/100011264; Grant(s): 238701, 238701

An affinity fingerprint is a vector consisting of a compound's affinity or potency against a reference panel of protein targets. Here, we present the QAFFP fingerprint, a 440-element in silico QSAR-based affinity fingerprint whose components are predicted by Random Forest regression models trained on bioactivity data from the ChEMBL database. Both real-valued (rv-QAFFP) and binary (b-QAFFP) versions of the QAFFP fingerprint were implemented, and their performance in similarity searching, biological activity classification and scaffold hopping was assessed and compared to that of the 1024-bit Morgan2 fingerprint (the RDKit implementation of the ECFP4 fingerprint). In both similarity searching and biological activity classification, the QAFFP fingerprint yields retrieval rates, measured by AUC (~ 0.65 and ~ 0.70 for similarity searching depending on data sets, and ~ 0.85 for classification) and EF5 (~ 4.67 and ~ 5.82 for similarity searching depending on data sets, and ~ 2.10 for classification), comparable to those of the Morgan2 fingerprint (similarity searching AUC of ~ 0.57 and ~ 0.66, and EF5 of ~ 4.09 and ~ 6.41, depending on data sets, classification AUC of ~ 0.87, and EF5 of ~ 2.16). However, the QAFFP fingerprint outperforms the Morgan2 fingerprint in scaffold hopping, as it is able to retrieve 1146 of the 1749 existing scaffolds, while the Morgan2 fingerprint reveals only 864 scaffolds.
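The abstract above reports retrieval rates as EF5, the enrichment factor at the top 5% of a ranked list. As a rough illustration only, the sketch below uses the common definition (hit rate among the top-ranked fraction divided by the hit rate in the whole set). The function name `enrichment_factor`, its parameters, and the rounding of the cutoff are assumptions made for this example and are not taken from the paper.

```python
def enrichment_factor(scores, labels, fraction=0.05):
    """Enrichment factor at the given top fraction of a ranked list.

    scores: similarity or model scores, higher means ranked earlier.
    labels: 1 for an active compound, 0 for an inactive one.
    Assumption: the cutoff is round(fraction * N), with at least one compound.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equally long")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active compound is required")
    n_top = max(1, round(fraction * n_total))
    # Rank compounds by descending score and count actives above the cutoff.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    # (hits_top / n_top) / (n_actives / n_total), written to avoid rounding noise.
    return (hits_top * n_total) / (n_top * n_actives)
```

Under this definition a value of 1 corresponds to random ranking, and the maximum attainable value depends on the proportion of actives in the data set.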

Apollo (Cambridge)

Computational exploration of molecular receptive fields in the olfactory bulb reveals a glomerulus-centric chemical map

Author: A Nagashima
A. Yablonka
B Malnic
BA Johnson
DB Chklovskii
DB Turner
EI Knudsen
ER Soucy
Eva M. Neuhaus
F Pedregosa
H Matsumoto
H Saito
IV Tetko
J Li
J Soelter
JH Kaas
JH Kaas
JL Pluznick
K Grill-Spector
K Miyamichi
K Mori
KJ Ressler
L Belluscio
L Ma
M Meister
M Schmuker
O Baud
PI Ezeh
R Haddad
R Vassar
R Vincis
RC Araneda
S Conzelmann
S Gabler
S Katada
SE Repicky
SL Sullivan
SM Boyle
T Abaffy
T Bozza
T Bozza
T Bozza
T Sato
TJ Imig
TP Hettinger
V Consonni
Verena Bautze
W Härdle
X Grosmaitre
Y Oka
Z Peterlin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/01/2020

© The Author(s) 2020. This article is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Progress in olfactory research is currently hampered by incomplete knowledge about chemical receptive ranges of primary receptors. Moreover, the chemical logic underlying the arrangement of computational units in the olfactory bulb has still not been resolved. We undertook a large-scale approach to characterising molecular receptive ranges (MRRs) of glomeruli in the dorsal olfactory bulb (dOB) innervated by the MOR18-2 olfactory receptor, also known as Olfr78, with human ortholog OR51E2. Guided by an iterative approach that combined biological screening and machine learning, we selected 214 odorants to characterise the response of MOR18-2 and its neighbouring glomeruli. We found that a combination of conventional physico-chemical and vibrational molecular descriptors performed best in predicting glomerular responses using nonlinear Support-Vector Regression. We also discovered several previously unknown odorants activating MOR18-2 glomeruli, and obtained detailed MRRs of MOR18-2 glomeruli and their neighbours. Our results confirm earlier findings that demonstrated tunotopy, that is, glomeruli with similar tuning curves tend to be located in spatial proximity in the dOB. In addition, our results indicate chemotopy, that is, a preference for glomeruli with similar physico-chemical MRR descriptions being located in spatial proximity. Together, these findings suggest the existence of a partial chemical map underlying glomerular arrangement in the dOB. Our methodology that combines machine learning and physiological measurements lights the way towards future high-throughput studies to deorphanise and characterise structure-activity relationships in olfaction.

University of Hertfordshire Research Archive

Could the clinical interpretability of subgroups detected using clustering methods be improved by using a novel two-stage approach?

Author: A Eye von
AD Hingorani
AE Westman
CK Jorgensen
D Amirall
E Vigneau
HW Christensen
HW Christensen
IV Tetko
JA Hayden
JC Hill
JM Beneciuk
JM Fritz
JM Schellingerhout
K Kroenke
LM Collins
MJ Stochkendahl
MJ Stochkendahl
NE Foster
O Takahashi
T Pincus
TG McGinn
World Health Organisation
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Background: Recognition of homogeneous subgroups of patients can usefully improve prediction of their outcomes and the targeting of treatment. There are a number of research approaches that have been used to recognise homogeneity in such subgroups and to test their implications. One approach is to use statistical clustering techniques, such as Cluster Analysis or Latent Class Analysis, to detect latent relationships between patient characteristics. Influential patient characteristics can come from diverse domains of health, such as pain, activity limitation, physical impairment, social role participation, psychological factors, biomarkers and imaging. However, such 'whole person' research may result in data-driven subgroups that are complex, difficult to interpret and challenging to recognise clinically. This paper describes a novel approach to applying statistical clustering techniques that may improve the clinical interpretability of derived subgroups and reduce sample size requirements. Methods: This approach involves clustering in two sequential stages. The first stage involves clustering within health domains and therefore requires creating as many clustering models as there are health domains in the available data. This first stage produces scoring patterns within each domain. The second stage involves clustering using the scoring patterns from each health domain (from the first stage) to identify subgroups across all domains. We illustrate this using chest pain data from the baseline presentation of 580 patients. Results: The new two-stage clustering resulted in two subgroups that approximated the classic textbook descriptions of musculoskeletal chest pain and atypical angina chest pain. The traditional single-stage clustering resulted in five clusters that were also clinically recognisable but displayed less distinct differences. 
Conclusions: In this paper, a new approach to using clustering techniques to identify clinically useful subgroups of patients is suggested. Research designs, statistical methods and outcome metrics suitable for performing that testing are also described. This approach has potential benefits but requires broad testing, in multiple patient samples, to determine its clinical value. The usefulness of the approach is likely to be context-specific, depending on the characteristics of the available data and the research question being asked of it.
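The two-stage procedure described in the Methods of this abstract (cluster within each health domain, then cluster the resulting per-domain patterns) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract names Cluster Analysis and Latent Class Analysis, whereas this sketch swaps in a tiny deterministic k-means, encodes each domain's cluster label as a one-hot vector for the second stage, and uses hypothetical names (`kmeans`, `two_stage_clusters`, `domains`).

```python
def dist2(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Tiny deterministic k-means (assumed stand-in for the clustering method)."""
    # Farthest-point initialisation: start from the first point, then keep
    # adding the point that is farthest from its nearest chosen centre.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    # Final assignment against the last set of centres.
    return [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]


def two_stage_clusters(rows, domains, k_domain=2, k_final=2):
    """Stage 1: cluster within each domain. Stage 2: cluster the patterns."""
    patterns = []
    for columns in domains.values():
        sub = [tuple(float(row[c]) for c in columns) for row in rows]
        patterns.append(kmeans(sub, k_domain))
    # Assumption: each domain's cluster label is one-hot encoded before stage 2.
    encoded = []
    for i in range(len(rows)):
        vec = []
        for labels in patterns:
            vec.extend(1.0 if labels[i] == j else 0.0 for j in range(k_domain))
        encoded.append(tuple(vec))
    return kmeans(encoded, k_final)
```

Here `rows` holds one list of numeric scores per patient and `domains` maps a domain name to the column indices belonging to it, for example `{"pain": [0, 1], "psych": [2, 3]}`.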

Springer - Publisher Connector

espace@Curtin

Binding interaction of a novel fluorophore with serum albumins: steady state fluorescence perturbation and molecular modeling analysis

Author: A Alam
A Alam
A Bhowmik
A Ray
AA Bhattacharya
B Banerji
BE Cohen
CA Lipinski
CA Royer
DS Rudra
GM Morris
GR Bickerton
HM Berman
IV Tetko
J Reichenwallner
JC Er
JR Simard
K Yamasaki
KA Majorek
M Banerjee
MD Hanwell
O Trott
OK Abou-Zied
S Curry
T Heyduk
Publication venue: 'Springer Science and Business Media LLC'

Public Library of Science (PLOS)

Challenges Predicting Ligand-Receptor Interactions of Promiscuous Proteins: The Nuclear Receptor PXR

Author: A Bender
A Biswas
A Khandelwal
AE Klon
AK Dunker
AK Ghose
B Blumberg
BL Urquhart
CR Yates
CY Ung
D Gupta
D Rogers
D Schuster
DG Teotico
DG Teotico
Erica J. Reschly
G Bertilsson
G Jones
IV Tetko
IV Tetko
J Feng
J Zhou
James M. Briggs
JE Chrencik
JT Metz
K Azzaoui
K Bachmann
K Yasuda
M Hassan
M Suarez
MA Lill
MA Lill
MA Lill
Manisha Iyer
Markus A. Lill
Matthew D. Krasowski
Matthew R. Redinbo
MD Krasowski
MD Krasowski
MN Jacobs
Nidhi
P Prathipati
RD Cramer
RE Watkins
RE Watkins
RE Watkins
S Ekins
S Ekins
S Ekins
S Ekins
S Ekins
S Ekins
S Kortagere
S Kortagere
S Kortagere
S Mani
S Verma
SA Kliewer
Sandhya Kortagere
Sean Ekins
W Mnif
WL DeLano
Y Xue
Y Xue
YD Gao
Publication venue: Public Library of Science
Publication date: 01/01/2009

Transcriptional regulation of some genes involved in xenobiotic detoxification and apoptosis is performed via the human pregnane X receptor (PXR), which in turn is activated by structurally diverse agonists including steroid hormones. Activation of PXR has the potential to initiate adverse effects, altering drug pharmacokinetics or perturbing physiological processes. Reliable computational prediction of PXR agonists would be valuable for pharmaceutical and toxicological research. There has been limited success with structure-based modeling approaches to predict human PXR activators. Slightly better success has been achieved with ligand-based modeling methods including quantitative structure-activity relationship (QSAR) analysis, pharmacophore modeling and machine learning. In this study, we present a comprehensive analysis focused on prediction of 115 steroids for ligand binding activity towards human PXR. Six crystal structures were used as templates for docking and ligand-based modeling approaches (two-, three-, four- and five-dimensional analyses). The best success at external prediction was achieved with 5D-QSAR. Bayesian models with FCFP_6 descriptors were validated after leaving a large percentage of the dataset out and using an external test set. Docking of ligands to the PXR structure co-crystallized with hyperforin had the best statistics for this method. Sulfated steroids (which are activators) were consistently predicted as non-activators, while poorly predicted steroids were docked in a reverse mode compared to 5α-androstan-3β-ol. Modeling of human PXR represents a complex challenge by virtue of the large, flexible ligand-binding cavity. This study emphasizes this aspect, illustrating modest success using the largest quantitative data set to date and multiple modeling approaches.

CiteSeerX

Carolina Digital Repository

D-Scholarship@Pitt

Purdue E-Pubs

A new framework for sign language alphabet hand posture recognition using geometrical features through artificial neural network (part 1)

Author: A Anand
CC Chang
CW Hsu
D Janez
G Aquino
G Fang
GC Cawley
H Han
H Liang
H-S Yeo
I Elias
IV Tetko
J De Jesús Rubio
K Assaleh
K Crammer
K-J Yoon
N Bahman
N Duta
P Garg
P Kauff
P Kishore
PVV Kishore
Q-S Zhu
RKE Yo
RR Sanchez
S Shamshirband
SS Rautaray
T Fawcett
TG Dietterich
W Xiong
X Wu
Z Luca
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Hand pose tracking is essential in sign languages. Automatic recognition of performed hand signs facilitates a number of applications, especially those that help people with speech impairments communicate with people without such impairments. This framework, called ASLNN, proposes a new hand posture recognition technique for the American Sign Language alphabet, based on a neural network that works on geometrical features extracted from hands. A user's hand is captured by a three-dimensional depth-based sensor camera; the hand is then segmented according to depth analysis features. The proposed system is called depth-based geometrical sign language recognition (DGSLR). DGSLR adopts an easier hand segmentation approach, which can further be used in segmentation applications. The proposed geometrical feature extraction framework improves recognition accuracy because its features are unchanged by hand orientation, compared to discrete cosine transform and moment invariant features. The findings of the iterations demonstrate that combining the extracted features resulted in improved accuracy rates. Then, an artificial neural network is used to drive the desired outcomes. ASLNN is proficient at hand posture recognition and provides accuracy of up to 96.78%, which will be discussed in an additional paper by these authors in this journal.

LJMU Research Online (Liverpool John Moores University)

Universiti Teknologi Malaysia Institutional Repository

Hydrophobicity and Charge Shape Cellular Metabolite Concentrations

Author: A Danchin
A Fersht
A Kummel
A Shrake
A Zarrinpar
AL Hopkins
Arren Bar-Even
Avi Flamholz
BD Bennett
BY Feng
BY Feng
CA Lipinski
DH Williams
E McCammick
Elad Noor
G Wachtershauser
I Nobeli
IA Berg
IV Tetko
J Bergstrom
J Seidler
J Thioulouse
JA Reynolds
JC Ewald
Jennifer L. Reed
Joerg M. Buescher
K Palm
KE Coan
L Wu
LC James
MA Oberhardt
N Ishii
NM O'Boyle
O Khersonsky
O Sínanoĝlu
P Stenberg
RJ Kleijn
RJ Williams
Ron Milo
SM Fendt
T Cheng
TJ Richmond
V Srinivasan
W Liebermeister
Y Benjamini
Publication venue: Public Library of Science
Publication date: 01/10/2011

What governs the concentrations of metabolites within living cells? Beyond specific metabolic and enzymatic considerations, are there global trends that affect their values? We hypothesize that the physico-chemical properties of metabolites considerably affect their in-vivo concentrations. The recently achieved experimental capability to measure the concentrations of many metabolites simultaneously has made the testing of this hypothesis possible. Here, we analyze such recently available data sets of metabolite concentrations within E. coli, S. cerevisiae, B. subtilis and human. Overall, these data sets encompass more than twenty conditions, each containing dozens (28-108) of simultaneously measured metabolites. We test for correlations with various physico-chemical properties and find that the number of charged atoms, non-polar surface area, lipophilicity and solubility consistently correlate with concentration. In most data sets, a change in one of these properties elicits a ∼100-fold increase in metabolite concentrations. We find that the non-polar surface area and number of charged atoms account for almost half of the variation in concentrations in the most reliable and comprehensive data set. Analyzing specific groups of metabolites, such as amino acids or phosphorylated nucleotides, reveals an even higher dependence of concentration on hydrophobicity. We suggest that these findings can be explained by evolutionary constraints imposed on metabolite concentrations and discuss possible selective pressures that can account for them. These include the reduction of solute leakage through the lipid membrane, avoidance of deleterious aggregates and reduction of non-specific hydrophobic binding. By highlighting the global constraints imposed on metabolic pathways, future research could shed light on aspects of biochemical evolution and the chemical constraints that bound metabolic engineering efforts.

Public Library of Science (PLOS)

Repository for Publications and Research Data

MPG.PuRe

CLUSS: Clustering of protein sequences based on a new similarity measure

Author: A Krause
Abdellali Kelil
AJ Enright
Alain Fleury
C Notredame
D Higgins
ELL Sonnhammer
F Titgemeyer
G Reinert
G Yona
H Lodish
IV Tetko
J Felsenstein
J Heringa
J Rocha
JD Thompson
JD Thompson
JH Ward
JH Ward
JS Varré
K Katoh
K Sjölander
K Sjölander
M Ike
M Kimura
MO Dayhoff
MY Leung
N Côté
N Wicker
P Pipenbacher
R Jothi
RC Edgar
RC Edgar
RO Duda
Ryszard Brzezinski
S Fanning
S Henikoff
S Karlin
S Karlin
S Karlin
S Vinga
SF Altschul
SF Altschul
Shengrui Wang
T Fukamizo
T Ishimizu
V Batagelj
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: The rapid burgeoning of available protein data makes the use of clustering within families of proteins increasingly important. The challenge is to identify subfamilies of evolutionarily related sequences. This identification reveals phylogenetic relationships, which provide prior knowledge to help researchers understand biological phenomena. A good evolutionary model is essential to achieve a clustering that reflects the biological reality, and an accurate estimate of protein sequence similarity is crucial to the building of such a model. Most existing algorithms estimate this similarity using techniques that are not necessarily biologically plausible, especially for hard-to-align sequences such as proteins with different domain structures, which cause many difficulties for the alignment-dependent algorithms. In this paper, we propose a novel similarity measure based on matching amino acid subsequences. This measure, named SMS for Substitution Matching Similarity, is especially designed for application to non-aligned protein sequences. It allows us to develop a new alignment-free algorithm, named CLUSS, for clustering protein families. To the best of our knowledge, this is the first alignment-free algorithm for clustering protein sequences. Unlike other clustering algorithms, CLUSS is effective on both alignable and non-alignable protein families. In the rest of the paper, we use the term "phylogenetic" in the sense of "relatedness of biological functions". Results: To show the effectiveness of CLUSS, we performed an extensive clustering on the COG database. To demonstrate its ability to deal with hard-to-align sequences, we tested it on the GH2 family. In addition, we carried out experimental comparisons of CLUSS with a variety of mainstream algorithms. These comparisons were made on hard-to-align and easy-to-align protein sequences. The results of these experiments show the superiority of CLUSS in yielding clusters of proteins with similar functional activity. Conclusion: We have developed an effective method and tool for clustering protein sequences to meet the needs of biologists in terms of phylogenetic analysis and prediction of biological functions. Compared to existing clustering methods, CLUSS more accurately highlights the functional characteristics of the clustered families. It provides biologists with a new and plausible instrument for the analysis of protein sequences, especially those that cause problems for the alignment-dependent algorithms.

Springer - Publisher Connector

Public Library of Science (PLOS)

Novel Functional MAR Elements of Double Minute Chromosomes in Human Ovarian Cells Capable of Enhancing Gene Expression

Author: AI Spriggs
C Morales
C Peach
C Sreekantaiah
DR VanDevanter
E Gebhart
E Gebhart
F Kuttler
Feng Chen
GC Allen
GM Wahl
IV Tetko
J Bode
J Bode
J Mirkovitch
Jesusa Rosales
Jing Bai
JM Kim
JR McGill
K Alitalo
Ki-Young Lee
M Muleris
M Yoshimoto
MJ Singer
ML Slovak
N Neznanov
PA Girod
PJ Hahn
PN Cockerill
RP Martins
Songbin Fu
SS Fakharzadeh
TH Kwaks
VS Lestou
W Van Leeuwen
Wei Cao
Xin-yuan Guan
Xinying Ma
XY Guan
Y Adachi
Y Fan
Y Fukuda
Yan Jin
Yang Yu
Yihui Fan
Zheng Liu
Publication venue: Public Library of Science
Publication date: 01/01/2011

Double minute chromosomes or double minutes (DMs) are cytogenetic hallmarks of extrachromosomal genomic amplification and play a critical role in tumorigenesis. Amplified copies of oncogenes in DMs have been associated with increased growth and survival of cancer cells, but DNA sequences in DMs, which are mostly non-coding, remain to be characterized. Following sequencing and bioinformatics analyses, we have found 5 novel matrix attachment regions (MARs) in a 682 kb DM in the human ovarian cancer cell line, UACC-1598. By electrophoretic mobility shift assay (EMSA), we determined that all 5 MARs interact with the nuclear matrix in vitro. Furthermore, qPCR analysis revealed that these MARs associate with the nuclear matrix in vivo, indicating that they are functional. Transfection of MAR constructs into human embryonic kidney 293T cells showed significant enhancement of gene expression as measured by luciferase assay, suggesting that the identified MARs, particularly MARs 1 to 4, regulate their target genes in vivo and are potentially involved in DM-mediated oncogene activation.

CiteSeerX